Automating Data-Model Workflows at a Level-12 HUC Scale in a Distributed Computing Environment

Authors

  • Lorne Leonard
  • Christopher J Duffy
Abstract

The HydroTerre web services provide the Essential Terrestrial Variable (ETV) datasets needed to create common hydrological models anywhere in the continental United States (CONUS). These services allow web users to download data for their own purposes in their own computing environment. The datasets are provided in standard Geographic Information System formats, and the data transformation depends on the users' own needs, goals, and computing environment. In this article, we demonstrate the feasibility of automating data-transformation workflows for United States Geological Survey level-12 Hydrological Unit Codes (HUC-12) to be consumed in hydrological models. The Penn State Integrated Hydrological Model (PIHM) is demonstrated here, but the workflows serve as a template for other models to adapt and become new services. The focus of this article is the data-transformation process, not the model results. We want to demonstrate that workflows empower modelers to create hydrological models rapidly anywhere in the CONUS, and to contribute to a dynamic resource that records the provenance of HUC-12 models. This requires explaining both the hardware and software architecture, because the way they are coupled is critical to web service performance. We discuss the feasibility of automating data-model workflows for CONUS HUC-12 catchments, with emphasis on reproducibility through data-model workflows and distributed computing resources.
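The pipeline the abstract describes — retrieve ETV data for a HUC-12 catchment, validate it, and transform it into model-ready input — can be sketched as below. This is a minimal illustration only: the names (`EtvBundle`, `to_model_input`, the layer files) are hypothetical and do not reflect the actual HydroTerre or PIHM APIs; a real workflow would also reproject rasters and build the model mesh.

```python
# Hypothetical sketch of an automated HUC-12 data-model workflow.
# EtvBundle, validate, and to_model_input are illustrative names,
# not the actual HydroTerre or PIHM interfaces.
from dataclasses import dataclass, field


@dataclass
class EtvBundle:
    huc12: str
    layers: dict = field(default_factory=dict)  # e.g. elevation, land cover


def validate(bundle: EtvBundle) -> EtvBundle:
    # A HUC-12 code is a 12-digit USGS hydrologic unit identifier.
    if len(bundle.huc12) != 12 or not bundle.huc12.isdigit():
        raise ValueError(f"not a HUC-12 code: {bundle.huc12}")
    return bundle


def to_model_input(bundle: EtvBundle) -> dict:
    # Transform each ETV layer into a model-ready record; a real
    # workflow would reproject rasters and build a mesh here.
    return {"huc12": bundle.huc12, "inputs": sorted(bundle.layers)}


def run_workflow(huc12: str, layers: dict) -> dict:
    # Chaining the same steps for every catchment is what makes
    # HUC-12 model construction repeatable and its provenance traceable.
    return to_model_input(validate(EtvBundle(huc12, layers)))


result = run_workflow("020402030405", {"elevation": "dem.tif", "landcover": "nlcd.tif"})
print(result["inputs"])  # ['elevation', 'landcover']
```

Because each step is a pure function of the bundle, the same workflow can be dispatched across distributed workers, one HUC-12 catchment per task.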


Similar resources

Automating data-model workflows at a level 12 HUC scale: Watershed modeling in a distributed computing environment

The prototype discussed in this article retrieves Essential Terrestrial Variable (ETV) web services and uses data-model workflows to transform ETV data for hydrological models in a distributed computing environment. The ETV workflow is a service layer over hundreds of terabytes of national datasets bundled for fast data access in support of watershed modeling using the United States Geological Surve...

Improving the PALBIMM scheduling algorithm for fault tolerance in cloud computing

Cloud computing is the latest technology that involves distributed computation over the Internet. It meets the needs of users through sharing resources and using virtual technology. The workflow user applications refer to a set of tasks to be processed within the cloud environment. Scheduling algorithms have a lot to do with the efficiency of cloud computing environments through selection of su...

Automating Workflows for Service Provisioning: Integrating AI and Database Technologies

Workflows are the structured activities that take place in information systems in typical business environments. These activities frequently involve several database systems, user interfaces, and application programs. Traditional database systems do not support workflows to any reasonable extent: usually human beings must intervene to ensure their proper execution. We have developed an architec...

Automating model building in ligand-based predictive drug discovery using the Spark framework

Automation of model building enables new predictive models to be generated in a faster, easier and more straightforward way once new data is available to predict on. Automation can also reduce the demand for tedious bookkeeping that is generally needed in manual workflows (e.g. intermediate files needed to be passed between steps in a workflow). The applicability of the Spark framework related ...

Journal:

Volume   Issue

Pages  -

Publication date: 2014